
Lingxiao Ma

TileLang: A Composable Tiled Programming Model for AI Systems

Apr 24, 2025

AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms

Feb 21, 2025

WaferLLM: A Wafer-Scale LLM Inference System

Feb 06, 2025

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor

Aug 09, 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Feb 27, 2024

BitNet: Scaling 1-bit Transformers for Large Language Models

Oct 17, 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

Apr 08, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation

Jan 26, 2023